Abstract

A q-query locally testable code (LTC) is an error correcting code that can be tested
by a randomized algorithm that reads at most q symbols from the given word. An
important question is whether there exist LTCs that have the $c^3$ property: constant
relative rate, constant relative distance, and that can be tested with a constant number
of queries. Such codes are sometimes referred to as "asymptotically good".

We show that dense LTCs cannot be $c^3$. The density of a tester is roughly the
average number of distinct local views in which a coordinate participates. An LTC is
dense if it has a tester with density $\omega(1)$.

More precisely, we show that a 3-query locally testable code with a tester of density
$\omega(1)$ cannot be $c^3$. Moreover, we show that a q-locally testable code ($q > 3$) with
a tester of density $\omega(1) \cdot n^{q-2}$ cannot be $c^3$. Our results hold when the tester has the
following two properties:

• (no weights:) Every q-tuple of queries occurs with the same probability.

• ('last-one-fixed':) In every 'test' of the tester, the values of any $q-1$ of the symbols
determine the value of the last symbol. (Linear codes have constraints of this
type.)

We also show that several natural ways to quantitatively improve our results would 
already resolve the general question, i.e. also for non-dense LTCs. 



1 Introduction 

An error correcting code is a set $C \subseteq \Sigma^n$. The rate of the code is $\log|C|/n$ and its (relative)
distance is the minimal Hamming distance between two different codewords $x, y \in C$, divided
by n. We only consider codes with distance $\Omega(1)$.

A code is called locally testable with q queries if it has a tester, which is a randomized
algorithm with oracle access to the received word x. The tester reads at most q symbols
from x and based on this local view decides if $x \in C$ or not. It should accept codewords with
probability one, and reject words that are far (in Hamming distance) from the code with
noticeable probability. The tester has parameters $(\tau, \varepsilon)$ if

$$\forall x \in \Sigma^n, \quad \mathrm{dist}(x, C) \ge \tau \implies \Pr[\text{Tester rejects } x] \ge \varepsilon.$$

Locally Testable Codes (henceforth, LTCs) have been studied extensively in recent years. A
priori, even the existence of LTCs is not trivial. The Hadamard code is a celebrated example
of an LTC, yet it is highly "inefficient" in the sense of having very low rate ($\log n/n$). Starting
with the work of Goldreich and Sudan [5], several more efficient constructions of LTCs have
been given. The best known rate for LTCs is $1/\log^{O(1)} n$, and these codes have 3-query
testers. The failure to construct $c^3$-LTCs leads to one of the main open questions in
the area: are there LTCs that are $c^3$, i.e. constant rate, constant distance and testable with a
constant number of queries (such codes are sometimes called "asymptotically good" in the
literature)? The case of two queries has been studied in [1]. However, the case of $q \ge 3$ is much
more interesting and still quite open.

Dense testers. In this work we make progress on a variant of the $c^3$ question. We show
that LTCs with so-called dense testers cannot be $c^3$.

The density of a tester is roughly the average number, per coordinate, of distinct local
views that involve that coordinate. More formally, every tester gives rise to a constraint
hypergraph $H = ([n], E)$ whose vertices are the n coordinates of the codeword, and whose
hyperedges correspond to all possible local views of the tester. Each hyperedge $h \in E$ is also
associated with a constraint, i.e. with a Boolean function $f_h : \Sigma^{|h|} \to \{0, 1\}$ that determines
whether the tester accepts or rejects on that local view. For a given string $x \in \Sigma^n$, we denote
by $x_h$ the substring obtained by restricting x to the coordinates in the hyperedge h. The
value of $f_h(x_h)$ determines if the string x falsifies the constraint or not.

Definition 1.1 (The test-hypergraph, density). Let $C \subseteq \Sigma^n$ be a code, let $q \in \mathbb{N}$ and $\tau, \varepsilon > 0$.
Let H be a constraint hypergraph with hyperedges of size at most q. H is a $(\tau, \varepsilon)$-test-hypergraph
for C if

• For every $x \in C$ and every $h \in E$, $f_h(x_h) = 1$.

• For every $x \in \Sigma^n$,

$$\mathrm{dist}(x, C) \ge \tau \implies \Pr_{h \in E}[f_h(x_h) = 0] \ge \varepsilon$$

where $\mathrm{dist}(x, y)$ denotes relative Hamming distance, i.e., the fraction of coordinates on
which x differs from y.

Finally, the density of H is simply the average degree, $|E|/n$.
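To make Definition 1.1 concrete, the following is a minimal sketch that computes the density and the rejection probability of a small test-hypergraph. The toy instance (seven coordinates indexed by the nonzero vectors of $F_2^3$, with one parity constraint per triple $\{u, v, u+v\}$) and all function names are assumptions chosen for illustration; they do not come from the text.

```python
from itertools import product

def rejection_probability(word, edges, constraint):
    """Pr over a uniformly random hyperedge h that f_h rejects the local view word_h."""
    rejected = sum(1 for h in edges if not constraint(tuple(word[i] for i in h)))
    return rejected / len(edges)

def density(edges, n):
    """Average degree |E| / n of the test-hypergraph."""
    return len(set(edges)) / n

def relative_distance_to_code(word, code):
    """Relative Hamming distance from word to the nearest codeword."""
    return min(sum(a != b for a, b in zip(word, c)) for c in code) / len(word)

# Toy instance (an assumption for illustration): coordinates are the 7 nonzero
# vectors of F_2^3, and every triple {u, v, u+v} carries a parity constraint.
points = [v for v in product((0, 1), repeat=3) if any(v)]
index = {v: i for i, v in enumerate(points)}
xor = lambda u, v: tuple(a ^ b for a, b in zip(u, v))
edges = sorted({tuple(sorted((index[u], index[v], index[xor(u, v)])))
                for u in points for v in points if u != v})
# Codewords are the evaluations of the 8 linear functions on the 7 points.
code = [tuple(sum(a * b for a, b in zip(msg, x)) % 2 for x in points)
        for msg in product((0, 1), repeat=3)]
parity_ok = lambda view: sum(view) % 2 == 0

corrupted = (1, 0, 0, 0, 0, 0, 0)  # the zero codeword with one flipped coordinate
```

In this toy instance every codeword satisfies all constraints, while the corrupted word is rejected exactly by the hyperedges that contain the flipped coordinate.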

The hypergraph is equivalent to a tester that selects one of the hyperedges uniformly at 
random. Observe that we disallow weights on the hyperedges. This will be discussed further 
below. 

Goldreich and Sudan [5] proved that every tester with density $\omega(1)$ can be made into
a "sparse" tester with density $O(1)$ by randomly eliminating each hyper-edge with suitable
probability. This means that a code can have both dense and sparse testers at the same
time. Hence, we define a code to have density $\ge d$ if it has a tester with density d. In this
work we show that the existence of certain dense testers restricts the rate of the code.

We say that an LTC is sparse if it has no tester whose density is $\omega(1)$. We do not know
of any LTC that is sparse. Thus, our work here provides some explanation for the bounded
rate that known LTCs achieve.

In fact, one wonders whether density is an inherent property of LTCs. The intuition 
for such a claim is that in order to be locally testable the code seems to require a certain 
redundancy among the local tests, a redundancy which might be translated into density. If 
one were to prove that every LTC is dense, then it would rule out, by combination with our
work, the existence of $c^3$-LTCs.

In support of this direction we point to the work of the second author with co-authors
(Ben-Sasson et al.) where it is shown that every linear LTC (even with bounded rate)
must have some non-trivial density. That is, they show that no linear LTC can be tested only
with tests that form a basis of the dual code. Namely, some constant density is required in
every tester of an LTC.

1.1 Our results 

We bound the rate of LTCs with dense testers. We only consider testers whose constraints
have the "last-one-fixed" (LOF) property, i.e. that the values of any $q-1$ symbols determine
the value of the last symbol. Note for instance that any linear constraint has this property.

We present different bounds for the case $q = 3$ and the case $q > 3$, where q denotes the
number of queries.

Theorem 1.2. Let $C \subseteq \{0, 1\}^n$ be a 3-query LTC with distance $\delta$, and let H be a $(\delta/3, \varepsilon)$-test-hypergraph
with density d and LOF constraints. Then, the rate of C is at most $O(1/d^{1/2})$.

For the case of $q > 3$ queries we have the following result.

Theorem 1.3. Let $C \subseteq \{0, 1\}^n$ be a q-query LTC with distance $\delta$, and let H be a $(\delta/2, \varepsilon)$-test-hypergraph
with density $\Delta$, where $\Delta = d n^{q-2}$, and LOF constraints. Then, the rate of
C is at most $O(1/d)$.

Extensions. In this preliminary version we assume that the alphabet is Boolean, but the
results easily extend to any finite alphabet $\Sigma$. It may also be possible to get rid of the
"last-one-fixed" restriction on the constraints, but this remains to be worked out.

Improvements. We show that several natural ways of improving our results will already
resolve the 'bigger' question of ruling out $c^3$-LTCs.

• In this work we only handle non-weighted testers, i.e., where the hyper-graph has no
weights. In general a tester can put different weights on different hyperedges. This
is sometimes natural when combining two or more "types" of tests each with certain
probability. This limitation cannot be eliminated altogether, but may possibly be
addressed via a more refined definition of density. See further discussion in Section 4.3.






• In Theorem 1.2 we prove that $\rho \le O(1/d^{1/2})$. We show that any improvement of the 0.5
exponent (say to $0.5 + \varepsilon$) would again rule out the existence of $c^3$-LTCs, see Lemma 4.1.

• In Theorem 1.3 we bound the rate only when the density is very high, namely, $\omega(n^{q-2})$.
We show, in Lemma 4.2, that any bound for density $O(n^{q-3})$ would once more rule
out the existence of $c^3$-LTCs. It seems that our upper bound of $\omega(n^{q-2})$ can be made
to meet the lower bound, possibly by arguments similar to those in the proof of Theorem 1.2.

Related work. In the course of writing our result we have learned that Eli Ben-Sasson 
and Michael Viderman have also been studying the connection between density and rate and 
have obtained related results, through seemingly different methods. 

2 Moderately dense 3-query LTCs cannot be $c^3$

In this section we prove Theorem 1.2, which we now recall:

Theorem 1.2. Let $C \subseteq \{0, 1\}^n$ be a 3-query LTC with distance $\delta$, and let H be a $(\delta/3, \varepsilon)$-test-hypergraph
with density d and LOF constraints. Then, the rate of C is at most $O(1/d^{1/2})$.

In order to prove the main theorem, we consider the hypergraph $H = (V, E(H))$ whose
vertices are the coordinates of the code, and whose hyper-edges correspond to the different
tests of the tester. By assumption, H has dn distinct hyper-edges. We describe an algorithm
in Figure 1 for assigning values to coordinates of a codeword, and show that a codeword is
determined using $k = O(n/d^{1/2})$ bits.



We need the following definition. For a partition $(A, B)$ of the vertices V of H, we define
the graph $G_B = (A, E)$ where

$$E = \{\{a_1, a_2\} \subseteq A \mid \exists b \in B, \ \{a_1, a_2, b\} \in E(H)\}.$$

A single edge $\{a_1, a_2\} \in E(G_B)$ may have more than one "preimage", i.e., there may be two
(or more) distinct vertices $b, b' \in B$ such that both hyper-edges $\{a_1, a_2, b\}$, $\{a_1, a_2, b'\}$ are
in H. For simplicity we consider the case where the constraints are linear^1, which implies
that for every codeword $w \in C$: $w_b = w_{b'}$. This is a source of some complication for our
algorithm, which requires the following definition.

Definition 2.1. Two vertices $v, v'$ are equivalent if

$$\forall w \in C, \quad w_v = w_{v'}.$$

Clearly this is an equivalence relation. A vertex has multiplicity m if there are exactly m
vertices in its equivalence class. The reader is invited to assume, at first read, that all
multiplicities are 1.

Denote by $V^*$ the set of vertices whose multiplicity is at most $\beta d^{1/2}$ for $\beta = \alpha/16$.

0. Init: Let $\alpha = 3\varepsilon/\delta$ and fix $\beta = \alpha/16$. Let B contain all vertices with multiplicity
at least $\beta d^{1/2}$. Let F contain a representative from each of these multiplicity
classes. Let B also contain all "fixed" vertices (whose value is the same for all
codewords).

1. Clean: Repeat the following until B remains fixed:

(a) Add to B any vertex that occurs in a hyper-edge that has two endpoints
in B.

(b) Add to B all vertices in a connected component of $G_B$ whose size is at
least $\beta d^{1/2}$, and add an arbitrary element in that connected component
into F.

(c) Add to B any vertex that has an equivalent vertex in B.

2. S-step: Each vertex outside B tosses a biased coin and goes into S with probability
$1/d^{1/2}$. Let $B \leftarrow B \cup S$ and set $F \leftarrow F \cup S$.

3. If there are at least two distinct $x, y \in C$ such that $x_B = y_B$ goto step 1,
otherwise halt.

Figure 1: The Algorithm
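The Clean phase of Figure 1 can be sketched in code as follows. This is only an illustration under simplifying assumptions: it implements steps 1(a) and 1(b) for a 3-uniform hypergraph and omits step 1(c), since deciding which vertices are equivalent requires the code itself. The function name `clean`, the use of integer vertex labels, and the choice of the smallest label as the "arbitrary" representative are all hypothetical.

```python
def clean(edges, B, F, threshold):
    """One Clean phase (steps 1a and 1b) on a 3-uniform hypergraph.

    edges: list of 3-element frozensets of integer vertices.
    B, F: sets of vertices; updated copies are returned.
    threshold: component size (beta * sqrt(d) in the text) that triggers step 1b.
    """
    B, F = set(B), set(F)
    vertices = set().union(*edges)
    changed = True
    while changed:
        changed = False
        # (a) a hyperedge with two endpoints in B pulls in its third endpoint.
        for h in edges:
            outside = h - B
            if len(outside) == 1:
                B |= outside
                changed = True
        # (b) build G_B on the vertices outside B: {a1, a2} is an edge when
        # some hyperedge {a1, a2, b} has its third endpoint b inside B.
        adj = {v: set() for v in vertices - B}
        for h in edges:
            outside = h - B
            if len(outside) == 2:
                a1, a2 = outside
                adj[a1].add(a2)
                adj[a2].add(a1)
        seen = set()
        for v in adj:
            if v in seen:
                continue
            comp, stack = {v}, [v]
            while stack:
                u = stack.pop()
                for w in adj[u] - comp:
                    comp.add(w)
                    stack.append(w)
            seen |= comp
            if len(comp) >= threshold:
                F.add(min(comp))  # one representative per large component
                B |= comp
                changed = True
    return B, F

example_edges = [frozenset(e) for e in [(0, 1, 2), (2, 3, 4), (2, 4, 5)]]
```

On `example_edges` with $B = \{0, 1\}$, step 1(a) first adds vertex 2, after which vertices 3, 4, 5 form a single connected component of $G_B$; whether that component enters B depends on the threshold.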

The following lemma is easy. 

Lemma 2.2. If the algorithm halted, the code has at most $2^{|F|}$ words.

Proof. This follows since at each step setting the values to vertices in F already fully determines
the value of all vertices in B (in any valid codeword). Once the algorithm halts, the
values of a codeword on B determine the entire codeword. Thus, there can be at most $2^{|F|}$
codewords. ■

Let $B_t$ denote the set B at the end of the t-th Clean step. In order to analyze the expected
size of F when the algorithm halts, we analyze the probability that vertices not yet in B
will go into B on the next iteration. For a vertex v, this is determined by its neighborhood
structure. Let

$$E_v = \{\{u, u'\} \mid u, u' \in V^*, \text{ and } \{u, u', v\} \in E(H)\}$$

be a set of edges. Denote by A the vertices v with large $|E_v|$,

$$A = \{v \mid |E_v| \ge \alpha d\}.$$

The following lemma says that if v has sufficiently large $E_v$ then it is likely to enter B in
the next round:

Lemma 2.3. For $t \ge 2$, if $v \in A$ then

$$\Pr[v \in B_t] \ge \frac{1}{2}.$$

^1 More generally, when the constraints are LOF the set of all such b's can be partitioned into all those
equal to $w_b$ and all those equal to $1 - w_b$.

Next, consider a vertex $v \notin A$ that is adjacent, in the graph $G_{B_{t-1}}$, to a vertex $v' \in A$.
This means that there is a hyper-edge $h = \{v, v', b\}$ where $b \in B_{t-1}$. If it so happens that
$v' \in B_t$ (and the above lemma guarantees that this happens with probability $\ge \frac{1}{2}$), then the
hyper-edge h would cause v to go into $B_t$ as well. In fact, one can easily see that if v goes
into $B_t$ then all of the vertices in its connected component in $G_{B_{t-1}}$ will go into $B_t$ as well
(via step 1a). Let $A_t$ be the set of vertices outside $B_t$ that are in A or are connected by a
path in $G_{B_t}$ to some vertex in A. We have proved

Corollary 2.4. For $t \ge 2$, let $v \in A_{t-1}$. Then

$$\Pr[v \in B_t] \ge \frac{1}{2}.$$

Lemma 2.5. If the algorithm hasn't halted before the t-th step and $|A_t| < \delta n/2$ then the
algorithm will halt at the end of the t-th step.

Before proving the two lemmas, let us see how they imply the theorem. 

Proof (of theorem). For each $t \ge 2$, Corollary 2.4 implies that for each $v \in A_t$ half of the
S's put it in $B_t$. We can ignore the sets S whose size is above $2n/d^{1/2}$, as their fraction is
negligible. By linearity of expectation, we expect at least half of $A_t$ to enter $B_t$. In particular,
fix some $S_{t-1}$ to be an S that attains (or exceeds) the expectation. As long as $|A_t| \ge \delta n/2$
we get

$$|B_t| \ge |B_{t-1}| + |A_t|/2 \ge |B_{t-1}| + \delta n/4.$$

Since $|B_t| \le n$, after $\ell \le 4/\delta$ iterations, when the algorithm runs with $S_1, \dots, S_\ell$, we must
have $|A_\ell| < \delta n/2$. This means that the conditions of Lemma 2.5 hold, and the algorithm
halts.

How large is the set F? In each S-step the set F grew by $|S| \le 2n/d^{1/2}$ (recall we
neglected S's that were larger than that). The total number of vertices that were added to
F in S-steps is thus $O(\ell \cdot n/d^{1/2})$.

Other vertices are added into F in the init step and in step 1b. In both of these steps
one vertex is added to F for every $\beta d^{1/2}$ vertices outside B that are added into B. Since
vertices never exit B, the total number of this type of F-vertices is at most $n/(\beta d^{1/2})$.

Altogether, with non-zero probability, the final set F has size $O(1/d^{1/2}) \cdot n$. Together with
Lemma 2.2 this gives the desired bound on the number of codewords and we are done. ■
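The accounting in this proof can be checked numerically. The sketch below adds the two contributions to $|F|$ with the constants used in the text ($\alpha = 3\varepsilon/\delta$, $\beta = \alpha/16$, at most $4/\delta$ S-steps, each adding at most $2n/d^{1/2}$ vertices). The function name `f_size_bound` and the sample parameter values are assumptions for illustration only.

```python
import math

def f_size_bound(n, d, delta, eps):
    """Upper bound on |F| assembled from the two contributions in the proof."""
    alpha = 3 * eps / delta
    beta = alpha / 16
    iterations = math.floor(4 / delta)                 # at most 4/delta S-steps
    from_s_steps = iterations * 2 * n / math.sqrt(d)   # each S has size <= 2n/sqrt(d)
    from_clean = n / (beta * math.sqrt(d))             # init step and step 1b
    return from_s_steps + from_clean

# For fixed delta and eps the bound scales like n / sqrt(d):
# quadrupling d halves it.
bound_d100 = f_size_bound(1000, 100, 0.1, 0.01)
bound_d400 = f_size_bound(1000, 400, 0.1, 0.01)
```

Since the number of codewords is at most $2^{|F|}$, dividing this bound by n gives the $O(1/d^{1/2})$ rate bound, with the dependence on $\delta$ and $\varepsilon$ hidden in the constant.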

We now prove the two lemmas. 
2.1 Proof of Lemma 2.3

We fix some $v \in A$. If $v \in B_{t-1}$ then we are done since $B_t \supseteq B_{t-1}$. So assume $v \notin B_{t-1}$
and let us analyze the probability of v entering $B_t$ over the random choice of the set S at
iteration $t - 1$. This is dictated by the graph structure induced by the edges of $E_v$. Let us
call this graph $G = (U, E_v)$, where U contains only the vertices that touch at least one edge
of $E_v$. We do not know how many vertices participate in U, but we know that $|E_v| \ge \alpha d$.

We begin by observing that all of the neighbors of a vertex u must be in the same multiplicity
class^2. Indeed each of the edges $\{v, u, u_i\}$ is a hyper-edge in H and the value of $u_i$ is
determined by the values of v and u. Therefore, the degree in G of any vertex $u \in U$ is
at most $\beta d^{1/2}$, since vertices with higher multiplicity are not in $V^*$ and therefore do not
participate in edges of $E_v$.

For each $u \in U$ let $I_u$ be an indicator variable that takes the value 1 iff there is a neighbor
of u that goes into S. If this happens then either

• $u \in S$: this means that v has a hyperedge whose two other endpoints are in $B_t$ and
will itself go into $B_t$ (in step 1a).

• $u \notin S$: this means that the graph $G_B$ will have an edge $\{v, u\}$.

If the first case occurs for any $u \in U$ we are done, since v goes into $B_t$ in step 1a. Otherwise,
the random variable $I = \sum_{u \in U} I_u$ counts how many distinct edges $\{v, u\}$ will occur in $G_{B_t}$. If
this number is above $\beta d^{1/2}$ then v will go into $B_t$ (in step 1b) and we will again be done. It
is easy to compute the expected value of I. First, observe that

$$E[I_u] = 1 - \left(1 - \frac{1}{d^{1/2}}\right)^{\deg(u)}$$

where $\deg(u)$ denotes the degree of u in G, and since the degree of u is at most $\beta d^{1/2}$, this
value is between $\deg(u)/2d^{1/2}$ and $\deg(u)/d^{1/2}$. By linearity of expectation

$$E[I] = \sum_u E[I_u] \ge \sum_u \deg(u)/2d^{1/2} = |E_v|/d^{1/2} \ge \alpha d^{1/2}.$$

We will show that I has good probability of attaining a value near the expectation (and in
particular at least $\alpha d^{1/2}/2 > \beta d^{1/2}$) and this will put v in $B_t$ at step 1b. The variables $I_u$ are
not mutually independent, but we will be able to show sufficient concentration by bounding
the variance of I, and applying Chebyshev's inequality.
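The two-sided estimate on $E[I_u]$ used above can be checked numerically. The sketch below verifies, for every degree up to $\beta d^{1/2}$, that $1 - (1 - d^{-1/2})^{\deg(u)}$ lies between $\deg(u)/2d^{1/2}$ and $\deg(u)/d^{1/2}$. It assumes $\beta \le 1$, which is what makes the lower bound valid; the function names and the sample values of d and $\beta$ are hypothetical.

```python
import math

def prob_some_neighbor_selected(deg, d):
    """E[I_u]: probability that at least one of deg neighbors enters S."""
    p = 1 / math.sqrt(d)
    return 1 - (1 - p) ** deg

def check_bounds(d, beta):
    """Assert deg*p/2 <= E[I_u] <= deg*p for all deg <= beta*sqrt(d); return the largest deg checked."""
    p = 1 / math.sqrt(d)
    max_deg = int(beta * math.sqrt(d))
    for deg in range(1, max_deg + 1):
        value = prob_some_neighbor_selected(deg, d)
        # small tolerance guards against floating-point rounding at deg = 1
        assert deg * p / 2 <= value <= deg * p + 1e-12
    return max_deg

largest_degree_checked = check_bounds(10_000, 1.0)
```

The upper bound is the union bound and holds for every degree; the lower bound relies on $\deg(u) \cdot d^{-1/2} \le 1$.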

The random variables $I_u$ and $I_{u'}$ are dependent exactly when $u, u'$ have a common neighbor
(the value of $I_u$ depends on whether the neighbors of u go into S). We already know
that having a common neighbor implies that $u, u'$ are in the same multiplicity class. Since
$U \subseteq V^*$, this multiplicity class can have size at most $\beta d^{1/2}$. This means that we can partition
the vertices in U according to their multiplicity class, such that $I_u$ and $I_{u'}$ are fully independent
when $u, u'$ are from distinct multiplicity classes. Let $u_1, u_2, \dots$ be representatives of
the multiplicity classes, and let $d_i \le \beta d^{1/2}$ denote the size of the i-th multiplicity class. Also,
write $u \sim u'$ if they are from the same multiplicity class.

^2 Or, more generally for LOF constraints, in one of a constant number of multiplicity classes.

$$\mathrm{Var}[I] = E[I^2] - (E[I])^2 = \sum_{u,u'} E[I_u I_{u'}] - \sum_{u,u'} E[I_u]E[I_{u'}]$$

$$= \sum_{u \sim u'} E[I_u I_{u'}] + \sum_{u \not\sim u'} E[I_u I_{u'}] - \sum_{u,u'} E[I_u]E[I_{u'}]$$

$$\le \sum_{u \sim u'} E[I_u I_{u'}] \le \sum_u \beta d^{1/2} \cdot E[I_u] \le \sum_u \beta d^{1/2} \cdot \frac{\deg(u)}{d^{1/2}}$$

$$= \beta \sum_u \deg(u) = 2\beta |E_v|.$$

By Chebyshev's inequality,

$$\Pr[|I - E[I]| \ge a] \le \mathrm{Var}[I]/a^2.$$

Plugging in $a = E[I]/2$ we get

$$\Pr\left[|I - E[I]| \ge \frac{E[I]}{2}\right] \le \frac{4\mathrm{Var}[I]}{(E[I])^2} \le \frac{8\beta |E_v|}{|E_v|^2/d} \le \frac{8\beta}{\alpha}$$

and so by choosing $\beta = \alpha/16$ this probability is at most a half. Thus, the probability that
$I \ge E[I]/2 \ge \alpha d^{1/2}/2 > \beta d^{1/2}$ is at least a half. As we said before, whenever $I \ge \beta d^{1/2}$ we are
guaranteed that v will enter $B_t$ in the next Clean step (1b) and we are done. ■

2.2 Proof of Lemma 2.5

We shall prove that if the algorithm hasn't halted before the t-th step and $|A_t| < \delta n/2$ then
$|B_t| > (1 - \delta)n$. This immediately implies that the algorithm must halt because after fixing
values to more than a $1 - \delta$ fraction of the coordinates of a codeword, there is a unique way
to complete it.

Recall that A is the set of all vertices v for which $|E_v| \ge \alpha d$. The set $B_t$ is the set B in
the algorithm after the t-th Clean step. The set $A_t$ is the set of vertices outside $B_t$ that are
connected by a path in $G_{B_t}$ to some vertex in A. Finally, denote $G = G_{B_t}$.

Assume for contradiction that $|B_t| \le (1 - \delta)n$ and $|A_t| < \delta n/2$. This means that
$Z = V \setminus (A_t \cup B_t)$ contains more than $\delta n/2$ vertices. Since $Z \cap A = \emptyset$, every vertex $v \in Z$
has $|E_v| < \alpha d$. Our contradiction will come by finding a vertex in Z with large $E_v$. If the
algorithm doesn't yet halt, there must be two distinct codewords $x, y \in C$ that agree on $B_t$.
Let $U_{x \ne y} = \{u \in V \mid x_u \ne y_u\}$. This is a set of size at least $\delta n$ that is disjoint from $B_t$. Since
$|A_t| < \delta n/2$ there must be at least $\delta n/2$ vertices in $Z \cap U_{x \ne y}$. Suppose $u \in Z \cap U_{x \ne y}$ and
suppose $u'$ is adjacent to u in G. First, by definition of Z, $u \in Z$ implies $u' \in Z$. Next, we
claim that $u \in U_{x \ne y}$ implies $u' \in U_{x \ne y}$. Otherwise there would be an edge $\{u, u', b\} \in E(H)$
such that $b \in B_t$, and such that $x_u \ne y_u$ but both $x_{u'} = y_{u'}$ and $x_b = y_b$. This means that
either x or y must violate this edge, contradicting the fact that all hyper-edges should accept
a legal codeword. We conclude that the set $Z \cap U_{x \ne y}$ is a union of connected components
of G. Since each connected component has size at most $\beta d^{1/2}$ (otherwise it would have gone
into B in a previous Clean step) we can find a set $D \subseteq Z \cap U_{x \ne y}$ of size s, for

$$\delta n/3 \le s \le \delta n/2,$$

that is a union of connected components, i.e. such that no G-edge crosses the cut between
D and $V \setminus D$. Now define the hybrid word

$$w = x_D y_{V \setminus D}$$

that equals x on D and y outside D. We claim that $\mathrm{dist}(w, C) = \mathrm{dist}(w, y) = |D|/n \ge \delta/3$.
Otherwise there would be a word $z \in C$ whose distance to w is strictly less than $|D|/n \le \delta/2$
which, by the triangle inequality, would mean it is less than $\delta n$ away from y, thereby
contradicting the minimal distance $\delta n$ of the code.
Finally, we use the fact that C is an LTC,

$$\Pr_{h \in E}[f_h(w_h) = 0] \ge \varepsilon.$$

Clearly to reject w a hyperedge must touch D. Furthermore, such a hyperedge cannot
intersect $B_t$ on 2 vertices because then the third non-$B_t$ vertex also belongs to $B_t$. It cannot
intersect $B_t$ on 1 vertex because this means that either the two other endpoints are both
in D, which is impossible since such a hyperedge would reject the legal codeword x as well;
or this hyperedge induces an edge in G that crosses the cut between D and $V \setminus D$. Thus,
rejecting hyper-edges must not intersect $B_t$ at all.

Altogether we have $\varepsilon d n$ rejecting hyperedges spanned on $V \setminus B_t$ such that each one
intersects D. This means that there must be some vertex $v \in D$ that touches at least
$\varepsilon d n/(\delta n/3) = \alpha d$ rejecting hyperedges. Recall that $D \subseteq Z$ is disjoint from A, which means
that $|E_v| < \alpha d$. On the other hand, each rejecting hyperedge touching v must add a distinct
edge to $E_v$. Indeed recall that $E_v$ contains an edge $\{u, u'\}$ for each hyperedge $\{u, u', v\}$ such
that $u, u' \in V^*$ and where $V^*$ is the set of vertices with multiplicity at most $\beta d^{1/2}$. The
claim follows since obviously all of the $\alpha d$ rejecting hyperedges are of this form (they do not
contain a vertex of high multiplicity as these vertices are in B). ■

3 Very dense q-LTCs cannot be $c^3$

In this section we prove the following theorem.

Theorem 1.3. Let $C \subseteq \{0, 1\}^n$ be a q-query LTC with distance $\delta$, and let H be a $(\delta/2, \varepsilon)$-test-hypergraph
with density $\Delta$, where $\Delta = d n^{q-2}$, and LOF constraints. Then, the rate of
C is at most $O(1/d)$.

Our proof is similar to the proof in the previous section. We describe an algorithm for
assigning values to coordinates of a codeword, and show that a codeword is determined
using $k \le n \cdot O(1/d)$ bits. As in the previous section, we use the following definitions. For a
partition $(A, B)$ of the vertices V of H, we define the 2-graph $G_B = (A, E)$ where

$$E = \{\{a_1, a_2\} \subseteq A \mid \exists b_3, \dots, b_q \in B, \ \{a_1, a_2, b_3, \dots, b_q\} \in E(H)\}.$$

0. Init: Let $B = \emptyset$, $F = \emptyset$. Let $\alpha = 2\varepsilon/\delta$, $\beta = \alpha/5^q$.

1. Clean: Repeat the following until B remains fixed:

(a) Add to B any vertex that occurs in a q-edge that has $q - 1$ endpoints in
B.

(b) Add to B all vertices in a connected component of $G_B$ whose size is at
least $\beta d$, and add an arbitrary element in that connected component into
F.

2. S-step: Each vertex outside B tosses $q - 2$ independent biased coins that get
1 with probability $p = 5^q/(\alpha d)$. A vertex goes into S if it got 1 in at least one of
the $q - 2$ coin tosses. Let $B \leftarrow B \cup S$ and set $F \leftarrow F \cup S$.

3. If there are at least two distinct $x, y \in C$ such that $x_B = y_B$ goto step 1,
otherwise halt.

Figure 2: The Algorithm
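The S-step of Figure 2 can be sketched as follows. The sketch keeps the outcome of each of the $q - 2$ coins as a separate layer, because the analysis of the algorithm refers to the sets of vertices whose i-th coin came up 1. The function names, the seeded generator and the parameter values are assumptions for illustration.

```python
import random

def s_step(outside, q, p, rng):
    """Toss q-2 independent p-biased coins per vertex outside B.

    Returns (S, layers): layers[i] holds the vertices whose (i+1)-th coin was 1,
    and S is the union of all layers.
    """
    order = sorted(outside)  # fixed iteration order, so a seeded rng is reproducible
    layers = [{v for v in order if rng.random() < p} for _ in range(q - 2)]
    return set().union(*layers), layers

def inclusion_probability(q, p):
    """Probability that a single vertex enters S; at most (q-2)*p by the union bound."""
    return 1 - (1 - p) ** (q - 2)

rng = random.Random(0)
S, layers = s_step(range(50), 5, 0.2, rng)
```

A vertex enters S with probability at most $(q-2)p$, so the expected size of S is at most $(q-2)pn$.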

The following two lemmas imply the theorem. 
Lemma 3.1. If the algorithm halted, the code has at most $2^{|F|}$ words.

Proof. Identical to the case of 3-queries. ■

Lemma 3.2. Let $B_t$ denote the set B at the end of the t-th Clean step. Let v be a vertex
whose H degree is at least $\alpha\Delta$. Then if $v \notin B_{t-1}$ the probability over the choice of S that
$v \in B_t$ is at least $\frac{1}{2}$.

Before proving the lemma, let us see how it implies the theorem. 

Proof (of theorem). Let L denote the vertices of degree less than $\alpha\Delta$. First, we prove that
$|L| < \delta n/2$. Otherwise, $|L| \ge \delta n/2$ and let $L' \subseteq L$ be an arbitrary subset of L of size $\delta n/2$.
Let $x \in C$ and consider the hybrid word w defined to equal x outside of $L'$ and $1 - x$ on $L'$.
Clearly

$$\mathrm{dist}(w, C) = \mathrm{dist}(w, x) = \delta/2$$

since were there a closer word $x' \ne x$ to w it would be less than $\delta$ away from x by the triangle
inequality. By the $(\delta/2, \varepsilon)$-LTC property we know that w is rejected with probability at least
$\varepsilon$, i.e., it is rejected by at least $\varepsilon n \Delta$ hyperedges. But simple averaging shows there must be
a vertex in $L'$ touching at least $\varepsilon n \Delta/(\delta n/2) = \alpha\Delta$ hyperedges, contradicting the definition
of $L' \subseteq L$.

Denote by $B_t$ the set B after the t-th Clean step. Also denote $A_t = V \setminus (B_t \cup L)$. Let
$v \in A_t$, then by Lemma 3.2 v will enter $B_{t+1}$ with probability at least 1/2. We expect, over
the choice of S, that half of the vertices of $A_t$ will go into $B_{t+1}$, and thus

$$E_S[|A_{t+1}|] \le |A_t|/2.$$

Let $S^{(1)}, \dots, S^{(t)}$ be the sets that attain or exceed this expectation at steps $1, \dots, t$ (again,
wlog we ignore sets S whose size deviates from their expected size which is at most $qn/d$). If
the algorithm chooses these sets $S^{(1)}, S^{(2)}, \dots$ then at the t-th step we have $|A_{t+1}| \le \frac{1}{2^t} \cdot n$.
For $t = \log(2/\delta)$ this is no larger than $\delta n/2$. Since L too is smaller than $\delta n/2$, we deduce that
$|B_{t+1}| > (1 - \delta)n$ and the algorithm must halt.

The size of the set F when the algorithm halts is no more than $\log(2/\delta)$ times twice the
expected size of S (which is at most $O(n/d)$), plus no more than $n/(\beta d)$ (from the Clean
steps). Altogether this is $O(n/d)$ and this bounds the rate by $O(1/d)$. ■

Let us now prove Lemma 3.2.

Proof. Consider the set of $(q-1)$-edges

$$N_{q-1}(v) = \{\{u_1, \dots, u_{q-1}\} \mid \{u_1, \dots, u_{q-1}, v\} \in E(H)\}.$$

While we know that $|N_{q-1}(v)| \ge \alpha\Delta = \alpha d n^{q-2}$, we do not know how many vertices participate
in these edges. Let us fix some arbitrary order converting each subset in $N_{q-1}(v)$ to an
ordered tuple.

Each vertex outside B tosses $q - 2$ independent coins, each with probability $p = 5^q/(\alpha d)$ of
getting 1. Let $S_i$, $1 \le i \le q - 2$, be the set of vertices whose i-th coin toss is 1. A vertex
goes into S if it gets 1 in at least one of the $q - 2$ independent coin tosses, i.e. S is the
union of all $S_i$'s.

For $1 \le i \le q - 2$ we define $N_i(v)$ similar to the above. Namely

$$N_i(v) = \{(u_1, \dots, u_i) \mid (u_1, \dots, u_i, x) \in N_{i+1}(v) \text{ and } x \in S_{q-1-i}\}.$$

We call the elements in $N_i(v)$ i-edges (even for $i = 1$).

Our goal is to show that with probability greater than $\frac{1}{2}$ over the selection of S, the set
$N_1(v)$ is of size greater than $\alpha d/5^q$. This would suffice to prove the lemma since this means
that v is in a large connected component and will go into B in the next iteration.

An i-edge $(u_1, \dots, u_i)$ is called h-heavy in $N_{i+1}(v)$ if the number of distinct x's for which
$(u_1, \dots, u_i, x) \in N_{i+1}(v)$ is at least h. For $1 \le i \le q - 2$, let $H_i(v)$ be the set of i-edges that
are $\frac{\alpha d}{2 \cdot 5^{q-2-i}}$-heavy in $N_{i+1}(v)$.

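Claim 3.3. $|H_i(v)| \ge \frac{|N_{i+1}(v)|}{2n}$, assuming $|N_{i+1}(v)| \ge \frac{\alpha d n^i}{5^{q-2-i}}$.

Proof. Indeed, otherwise the number of $(i+1)$-edges in $N_{i+1}(v)$ is too low, namely, at most

$$(\text{number of heavy } i\text{-edges}) \cdot n + (\text{number of non-heavy } i\text{-edges}) \cdot \frac{\alpha d}{2 \cdot 5^{q-2-i}}.$$

This is smaller than

$$\frac{|N_{i+1}(v)|}{2n} \cdot n + n^i \cdot \frac{\alpha d}{2 \cdot 5^{q-2-i}} \le \frac{|N_{i+1}(v)|}{2} + \frac{|N_{i+1}(v)|}{2} = |N_{i+1}(v)|.$$
■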
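The counting step behind Claim 3.3 can be illustrated on a small example. The sketch below groups a set of $(i+1)$-tuples by their length-i prefix, marks a prefix as heavy when it has at least h distinct extensions, and evaluates the upper bound "heavy prefixes times n plus non-heavy prefixes times h" on the number of tuples. The function names and the sample data are hypothetical.

```python
from collections import Counter

def heavy_prefixes(tuples, h):
    """Return (heavy, ext): ext counts distinct extensions per prefix,
    heavy is the set of prefixes with at least h extensions."""
    ext = Counter(t[:-1] for t in set(tuples))
    heavy = {prefix for prefix, count in ext.items() if count >= h}
    return heavy, ext

def counting_bound(tuples, h, n):
    """Upper bound on the number of distinct tuples: every heavy prefix has at
    most n extensions, every non-heavy prefix has fewer than h."""
    heavy, ext = heavy_prefixes(tuples, h)
    non_heavy = len(ext) - len(heavy)
    return len(heavy) * n + non_heavy * h

# Sample data over n = 4 vertices: prefix (0,) has three extensions,
# prefixes (1,) and (2,) have one each.
sample = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3)]
heavy, ext = heavy_prefixes(sample, 2)
```

If there were too few heavy prefixes, this bound would fall below the assumed size of $N_{i+1}(v)$, which is the contradiction used in the proof.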
We next show that edges in $H_i(v)$ have very high probability of being selected into $N_i(v)$.

Claim 3.4. For 1 <i < q — 2, an edge in Hi{v) goes into Ni{v) with probability greater than 
Pi =^ 1 — (1 — p)°^'^/2-5'' ^ ' > 1 — over the selection of Si, - ■ ■ , Si 



Proof. Consider {ui, ■ ■ ■ ,Ui} G Hi{y). By the definition of Hi{v) there are at least ir 



ad 



2-59 



distinct x's such that ■ ■ ■ ,Ui,x} e A^j+i(w), • ■ ■ goes into Ni{v) if at least one 
of these distinct x's is selected into Sq^i^i. The probability that at least one is selected into 
is Pi = 1 - (1 - p)"'i/2-5«-2-\ Note that since p = Q^/ad, Pi > 1 - ■ 

We are now ready to show that for $1 \le i \le q - 2$, with probability greater than $(1 - \frac{1}{2q})^i \ge \frac{1}{2}$
over the selection of $S_1, \dots, S_i$, $|N_i(v)| \ge \alpha d n^{i-1}/5^{q-1-i}$. Note that this implies that

$$|N_1(v)| \ge \alpha d/5^{q-2} > \alpha d/5^q.$$

This implies that v is in a large connected component and hence will enter into B in the
next iteration.

Claim 3.5. For $1 \le i \le q - 2$, let $N_i = |N_i(v)|$. Then

$$\Pr_{S_{q-1-i}}\left[N_i \ge \frac{1}{5n} N_{i+1}\right] \ge 1 - \frac{1}{2q}.$$

Proof. For every $e \in H_i(v)$ we define an indicator random variable $I_e$ that gets 1 iff e is
selected into $N_i(v)$, otherwise $I_e$ is 0. Let $I = \sum_{e \in H_i(v)} I_e$. By Claim 3.3 we have that if
$N_{i+1} \ge \alpha d n^i/5^{q-2-i}$ then $|H_i(v)| \ge \frac{N_{i+1}}{2n}$. Thus,

$$E[N_i] \ge E[I] = P_i |H_i(v)|.$$

The variance of I can be bounded as follows.



Var[I] 



ei,e2€Hi{v) 

mvnp.-p^)=E'[i]{--i)<E'[i]j- 



Pi 



iq 



The last inequality holds since Pi > 1 — which implies ^ — 1 < ^ 
By Chebyshev's inequality,

$$\Pr[|I - E[I]| \ge a] \le \mathrm{Var}[I]/a^2.$$

Plugging in $a = E[I]/2$ we get

$$\Pr\left[|I - E[I]| \ge \frac{E[I]}{2}\right] \le \frac{4\mathrm{Var}[I]}{E^2[I]} \le \frac{1}{2q}.$$

Thus, the probability that $I \ge E[I]/2 = P_i|H_i(v)|/2 \ge \frac{N_{i+1}}{5n}$ is at least
$(1 - 1/2q)$. Thus,

$$\Pr_{S_{q-1-i}}\left[N_i \ge \frac{1}{5n} N_{i+1}\right] \ge 1 - \frac{1}{2q}.$$
■

As a corollary of the last claim (Claim 3.5) we get the desired bound on $N_1(v)$:

Corollary 3.6.

$$\Pr\left[|N_1(v)| \ge \alpha d/5^q\right] \ge \frac{1}{2}.$$

Proof. We prove by downwards induction on i that

$$\Pr\left[N_i \ge \frac{N_{q-1}}{(5n)^{q-1-i}}\right] \ge \left(1 - \frac{1}{2q}\right)^{q-1-i}.$$

For $i = q - 1$ this holds with probability 1. By Claim 3.5, if the above holds for $i + 1$
then it holds for i. ■

The last corollary establishes the proof of the lemma.



4 Exploring possible improvements 
4.1 Tradeoff between rate and density 

Any improvement over our bound of $\rho \le 1/d^{1/2}$, say to a bound of the form $\rho \le 1/d^{1/2+\varepsilon}$,
would already be strong enough to rule out $c^3$-LTCs (with a non-weighted tester) regardless
of their density. The reason for this is the following reduction by Oded Goldreich.
Suppose, for the sake of contradiction, that there is some family 

Lemma 4.1. Suppose for some $q \ge 3$ and some $\varepsilon > 0$ the following were true.

For any family $\{C_i\}$ of q-query LTCs with rate $\rho$ such that each $C_i$
has a tester with density at least d, then $\rho \le 1/d^{\frac{1}{q-1}+\varepsilon}$.

Then, there is no family of q-query LTCs with constant rate and any density, such that the
tester is non-weighted.

Proof. Let $\beta = \frac{1}{q-1} + \varepsilon$, $t \in \mathbb{N}$. Let $\{C_i\}$ be an infinite family of q-LTCs with density
$d = O(1)$ and relative rate $\rho = \Omega(1)$. Then there is another infinite family $\{C'_i\}$ of q-LTCs
with density $d \cdot t^{q-1}$ and relative rate $\rho/t$. $C'_i$ is constructed from $C_i$ by duplicating each
coordinate t times and replacing each test hyper-edge by $t^q$ hyperedges. Clearly the density
and the rate are as claimed. The testability can also be shown. Plugging in the values
$\rho' = \rho/t$ and $d' = d t^{q-1}$ into the assumption we get

$$\rho/t = \rho' \le 1/(d')^{\beta} = 1/(d t^{q-1})^{\beta}.$$

In other words $\rho d^{\beta} \le t^{1 - \beta(q-1)}$. Since t is unbounded this can hold only if the exponent of t
is non-negative, i.e., $\beta \le 1/(q - 1)$, a contradiction. ■
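The duplication step in this proof can be sketched on the level of the test-hypergraph alone. Each coordinate v is replaced by t copies and each q-hyperedge by the $t^q$ hyperedges obtained by choosing one copy per endpoint, so the density grows by a factor $t^{q-1}$ while the block length grows by a factor t. The function name `duplicate`, the labeling of copies as `v * t + c`, and the sample hypergraph are assumptions for illustration; the constraints and the testability argument are not modeled.

```python
from itertools import product

def duplicate(edges, n, t):
    """Replace every coordinate by t copies and every q-edge by t**q edges.

    edges: iterable of tuples of distinct vertices in range(n).
    Copy c of vertex v gets the label v * t + c.
    Returns (new_edges, new_n).
    """
    new_edges = set()
    for h in edges:
        for copies in product(range(t), repeat=len(h)):
            new_edges.add(tuple(v * t + c for v, c in zip(h, copies)))
    return new_edges, n * t

sample_edges = [(0, 1, 2), (1, 2, 3)]        # q = 3, n = 4, density 2/4
blown_up, blown_up_n = duplicate(sample_edges, 4, 3)
old_density = len(sample_edges) / 4
new_density = len(blown_up) / blown_up_n
```

The number of codewords is unchanged by duplication, so the rate drops by exactly the factor t.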

4.2 For q > 3 density must be high 

Lemma 4.2. Let C be a q-LTC with rate $\rho$, and density d. Then there is a $(q + q')$-LTC $C'$
with density $d \cdot \binom{n}{q'}$ such that $C'$ has rate $\rho/2$, distance $\delta/2$.

Corollary 4.3. If there is a 3-LTC with constant rate and density, then there are LTCs with
$q > 3$ queries, constant rate, and density $\Omega(n^{q-3})$.

The corollary shows that our upper bounds from Theorem 1.3 are roughly correct in their
dependence on n, but there is still a gap in the exponent.

Proof (of lemma). Imagine adding another n coordinates to the code C such that they are
always all zero. Clearly the distance and the rate are as claimed. For the testability, we
replace each q-hyper-edge e of the hypergraph of C with $\binom{n}{q'}$ new hyperedges that consist
of the vertices of e plus any $q'$ of the new vertices. The test associated with this hyperedge
will accept iff the old test would have accepted, and the new vertices are assigned 0. It is
easy to see that the new hypergraph has average degree $d \cdot \binom{n}{q'}$. Testability can be shown as
well. ■
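The padding construction in this proof can likewise be sketched on the hypergraph level. Every original hyperedge is extended by every $q'$-subset of the n new all-zero coordinates, so the number of hyperedges is multiplied by $\binom{n}{q'}$. The function name `pad_with_zeros` and the sample hypergraph are assumptions for illustration; note that when the density is measured against the $2n$ coordinates of the padded code, it is $d \cdot \binom{n}{q'}$ up to a factor of 2.

```python
from itertools import combinations
from math import comb

def pad_with_zeros(edges, n, q_extra):
    """Add n new coordinates (labels n..2n-1) and extend each hyperedge by every
    q_extra-subset of them. Returns (new_edges, new_n)."""
    new_vertices = range(n, 2 * n)
    new_edges = [tuple(h) + extra
                 for h in edges
                 for extra in combinations(new_vertices, q_extra)]
    return new_edges, 2 * n

base_edges = [(0, 1, 2), (1, 2, 3)]          # q = 3, n = 4
padded, padded_n = pad_with_zeros(base_edges, 4, 2)
```

Each new hyperedge queries the q original positions plus $q'$ padding positions, which is why the query complexity rises to $q + q'$.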

4.3 Allowing weighted hypergraph-tests 

In this section we claim that when considering hyper-graph tests with weights, the density 
should not be defined as the ratio between the number of edges and the number of vertices. 
Perhaps a definition that takes the min-entropy of the graph into consideration would be 
better-suited, but this seems elusive, and we leave it for future work. 

We next show that if one defines the density like before (ignoring the weights) then every
LTC can be modified into one that has a maximally-dense tester. This implies that bounding
the rate as a function of the density is the same as simply bounding the rate.

Lemma 4.4. Let C be a q-LTC with $q \ge 3$, rate $\rho$, distance $\delta$, and any density. Then there
is another q-LTC $C'$ with a weighted tester of maximal density $\Omega(n^{q-1})$ such that $C'$ has
rate $\rho/2$, distance $\delta/2$.

Corollary 4.5. Let $f : \mathbb{N} \to \mathbb{N}$ be any non-decreasing non-constant function. Any bound of
the form $\rho \le 1/f(d)$ for weighted testers implies $\rho \le 1/f(n^{q-1})$, and in particular $\rho \to 0$. ■

Proof (of lemma). One can artificially increase the density of an LTC tester hypergraph
H by adding n new coordinates to the code that are always zero, and adding all possible
q-hyperedges over those coordinates (checking that the values are all-zero). All of the new
hyper-edges will be normalized to have total weight one half, and the old hyperedges will
also be re-normalized to have total weight one half. Clearly the rate and distance have been
halved, and the testability is maintained (with a different rejection ratio). However, the
number of hyperedges has increased to $n^q$ so the density is as claimed. ■